Optimized Execution of Action Chains through Subgoal Refinement∗

Authors

  • Freek Stulp
  • Michael Beetz
Abstract

In this paper we propose a novel computation model for the execution of abstract action chains. In this computation model a robot first learns situation-specific performance models of abstract actions. It then uses these models to automatically specialize the abstract actions for their execution in a given action chain. This specialization results in refined chains that are optimized for performance. As a side effect, this behavior optimization also appears to produce action chains with seamless transitions between actions.

Introduction

Many plan-based autonomous robot controllers generate chains of abstract actions in order to achieve complex, dynamically changing, and possibly interacting goals. To allow for plan-based control, the plan generation mechanisms are equipped with libraries of actions and causal models of these actions, specifying what each action can achieve, and under which circumstances. Because these actions are specified abstractly, they apply to a broad range of situations, which reduces the search space for planning.

The advantages of this abstraction, however, come at a cost. Because planning systems consider actions as black boxes with performance independent of the prior and subsequent steps, the system cannot tailor the actions to the contexts of their execution. This often yields behavior with abrupt transitions between actions, causing suboptimal performance. The resulting motion patterns are so characteristic of robots that people trying to imitate robotic behavior will do so by making abrupt movements between actions.

Let us illustrate these points using the autonomous robot soccer scenario depicted in Figure 1. To solve this task, the planner issues a three-step plan, also shown in the figure. If the robot naively executes the first action (Figure 1b), it might arrive at the ball with the goal at its back, an unfortunate position from which to start dribbling towards the goal.
The problem is that, in the abstract view of the planner, being at the ball is considered sufficient for dribbling the ball, and the dynamic state of the robot arriving at the ball is considered irrelevant to the dribbling action. What we would like the robot to do instead is to go to the ball in order to dribble it towards the goal afterwards. The robot should, as depicted in Figure 1c, perform the first action suboptimally in order to achieve a much better position for executing the second plan step.

∗ The work described in this paper was partially funded by the Deutsche Forschungsgemeinschaft in the SPP-1125.

[Figure 1: Alternative executions of the same plan. a) Goal: Score! b) Plan: go to ball, dribble ball, shoot. c) Plan: go to ball in order to dribble ball, shoot.]

In this paper we propose a novel computational model for plan execution, shown in Figure 2, that enables the planner to keep its abstract action models and that optimizes action chains at execution time. The basic idea of our approach is to learn performance models of abstract actions off-line from observed experience. Then, at execution time, our system determines the set of parameters that are not set by the plan and that therefore define the possible action executions. It then computes, for each abstract action, the parameterization for which the predicted performance of the action chain is optimal. This is done by refining the intermediate state between subsequent actions.

[Figure 2 (diagram labels): Off-line; Execution time; Action; Learn Performance Model; Perf. Model; Generate Action Chain; Initial state; Act. i; Interm. state; Act. i+1; Goal state; Subgoal refinement; Refined (optimal) subgoal.]
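The subgoal refinement idea can be illustrated with a minimal Python sketch for the soccer example. Everything in it is an assumption made for illustration and is not taken from the paper: the two performance models (`go_to_ball_time`, `dribble_time`) are hand-written stand-ins for models that would be learned from observed experience, the only free parameter of the intermediate state is the robot's heading at the ball, the speed and penalty constants are arbitrary, and the optimum is found by a simple grid search.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def go_to_ball_time(start, ball, arrival_heading):
    """Hypothetical stand-in for a learned model: predicted time to reach
    the ball when arriving with the given heading (not the authors' model)."""
    direct = math.atan2(ball[1] - start[1], ball[0] - start[0])
    # Driving straight is fastest; arriving at another heading costs extra.
    return math.dist(start, ball) / 1.0 + 0.5 * angle_diff(arrival_heading, direct) ** 2

def dribble_time(ball, goal, start_heading):
    """Hypothetical stand-in for a learned model: predicted time to dribble
    to the goal when starting with the given heading."""
    to_goal = math.atan2(goal[1] - ball[1], goal[0] - ball[0])
    # Turning with the ball is assumed to be expensive.
    return math.dist(ball, goal) / 0.5 + 2.0 * angle_diff(start_heading, to_goal) ** 2

def chain_time(start, ball, goal, heading):
    """Predicted time of the two-action chain for one intermediate heading."""
    return go_to_ball_time(start, ball, heading) + dribble_time(ball, goal, heading)

def refine_subgoal(start, ball, goal, steps=360):
    """Grid search over the free parameter (heading at the ball) for the
    value that minimizes the predicted time of the whole chain."""
    candidates = [2 * math.pi * k / steps for k in range(steps)]
    return min(candidates, key=lambda h: chain_time(start, ball, goal, h))

if __name__ == "__main__":
    start, ball, goal = (0.0, 0.0), (1.0, 0.0), (1.0, 3.0)
    naive = 0.0  # heading that is optimal for "go to ball" alone
    best = refine_subgoal(start, ball, goal)
    print("refined heading (deg):", round(math.degrees(best)))
    print("naive chain time:  ", chain_time(start, ball, goal, naive))
    print("refined chain time:", chain_time(start, ball, goal, best))
```

With these toy models the refined heading is a compromise between the two actions: the first action becomes slightly slower than its individually optimal execution, while the predicted time of the chain as a whole decreases.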

Similar resources

Tailoring Robot Actions to Task Contexts using Action Models

In motor control, high-level goals must be expressed in terms of low-level motor commands. An effective approach to bridge this gap, widespread in both nature and robotics, is to acquire a set of temporally extended actions, each designed for specific goals and task contexts. An action selection module then selects the appropriate action in a given situation. In this approach, high-level goals ...

Scheduling Parallel Execution of Planning and Action for a Hierarchically-Decomposable Planning Problem

This paper proposes a novel method to schedule parallel execution of planning and action. The method is for a class of planning problems which are hierarchically decomposed into two subproblems: (1) determining the next subgoal and (2) determining and executing an action sequence to achieve the subgoal. In this problem class, the upper-level planning process can be viewed as a process of gradua...

Optimized Execution of Action Chains Using Learned Performance Models of Abstract Actions

Many plan-based autonomous robot controllers generate chains of abstract actions in order to achieve complex, dynamically changing, and possibly interacting goals. The execution of these action chains often results in robot behavior that shows abrupt transitions between subsequent actions, causing suboptimal performance. The resulting motion patterns are so characteristic for robots that people...

Subgoal chaining and the local minimum problem

It is well known that performing gradient descent on fixed surfaces may result in poor travel through getting stuck in local minima and other surface features. Subgoal chaining in supervised learning is a method to improve travel for neural networks by directing local variation in the surface during training. This paper shows however that linear subgoal chains such as those used in ERA are not ...

Publication date: 2005